Redundant Hash Addressing for Large-Scale Query by Example Spoken Query Detection

Authors

  • Afsaneh Asaei
  • Dhananjay Ram
  • Hervé Bourlard
Abstract

State-of-the-art query-by-example spoken term detection (QbE-STD) systems represent speech as sequences of class-conditional posterior probabilities estimated by a deep neural network (DNN). The posteriors are often used for pattern matching or dynamic time warping (DTW). Using posterior probabilities as the speech representation offers several advantages in a classification system. One key property of posterior representations is that they admit a highly effective hashing strategy that enables indexing a large archive in divisions to reduce the search complexity. Moreover, posterior indexing leads to a compressed representation and enables pronunciation dewarping and partial detection with no need for DTW. We exploit these characteristics of the posterior space in the context of redundant hash addressing for QbE-STD. We evaluate the QbE-STD system on the AMI corpus and demonstrate that a tremendous speedup and superior accuracy are achieved compared to state-of-the-art pattern matching and DTW solutions. The system has the potential to enable massively large-scale query detection.
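The abstract does not spell out the hashing scheme, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each posterior vector is hashed to the index of its most probable class, that consecutive repeated codes are merged to remove duration differences (a simple stand-in for dewarping), and that overlapping code n-grams serve as the redundant addresses that vote for archive entries. The function names, the n-gram size, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def hash_posteriors(posteriors):
    """Hash each posterior vector to its most probable class index and
    merge consecutive repeats, so duration differences disappear.
    (Assumed hashing rule; the paper may use a different one.)"""
    codes = []
    for frame in posteriors:
        code = max(range(len(frame)), key=frame.__getitem__)
        if not codes or codes[-1] != code:
            codes.append(code)
    return codes


def ngrams(codes, n=2):
    """Overlapping n-grams of the code sequence: the redundant features."""
    return [tuple(codes[i:i + n]) for i in range(len(codes) - n + 1)]


def build_index(archive, n=2):
    """Inverted index from each n-gram to the utterances containing it."""
    index = defaultdict(set)
    for utt_id, posteriors in archive.items():
        for gram in ngrams(hash_posteriors(posteriors), n):
            index[gram].add(utt_id)
    return index


def search(index, query_posteriors, n=2):
    """Each distinct query n-gram votes for the utterances it addresses;
    utterances are returned ranked by vote count, with no DTW involved."""
    votes = Counter()
    for gram in set(ngrams(hash_posteriors(query_posteriors), n)):
        for utt_id in index.get(gram, ()):
            votes[utt_id] += 1
    return votes.most_common()


# Toy archive with three classes; values are made up for illustration.
archive = {
    "utt_a": [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]],
    "utt_b": [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1], [0.7, 0.2, 0.1]],
}
query = [[0.7, 0.2, 0.1], [0.2, 0.7, 0.1], [0.2, 0.6, 0.2], [0.1, 0.2, 0.7]]
index = build_index(archive)
ranked = search(index, query)
```

Because a query is matched through several overlapping n-grams, a partial match still collects some votes, which is one way to read the abstract's remark about partial detection. Only utterances that share at least one n-gram with the query are ever touched, which is where the reduction in search complexity would come from in a scheme of this kind.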


Similar resources

STD Method Based on Hash Function for NTCIR11 SpokenQuery&Doc Task

In this paper, we describe a spoken term detection (STD) method used in the Spoken Query and Documents task of the NTCIR-11 meeting. Our STD method extracts sub-sequences from the syllable-based speech recognition candidates of the target speech and converts them into bit sequences using a hash function. The query is also converted into a bit sequence in the same way. Term detection candidates ...


Use of GPU and Feature Reduction for Fast Query-by-Example Spoken Term Detection

For query-by-example spoken term detection (QbE-STD) on low-resource languages, variants of dynamic time warping (DTW) techniques are used. However, DTW-based techniques are slow, which limits search in large spoken audio databases. To enable fast search in large databases, we exploit the intensive parallel computation offered by graphics processing units (GPUs). In thi...


Addressing the out-of-vocabulary problem for large-scale Chinese spoken term detection

While the out-of-vocabulary (OOV) problem remains a challenge for English spoken term detection tasks, it is underestimated for Chinese. This is because a Chinese OOV query term can still be matched as a sequence of Chinese characters, with each character itself being a word in the vocabulary. However, our experiments show that search accuracy levels differ significantly when a query is or is ...


Spoken Content Retrieval Using Distance Combination and Spoken Term Detection Using Hash Function for NTCIR10 SpokenDoc2 Task

In this paper we describe a spoken content retrieval (SCR) method and a spoken term detection (STD) method that were used in the 2nd round of the IR (Information Retrieval) for Spoken Documents (SpokenDoc2) task. Our SCR method maps the target documents into multiple vector spaces, which include a word-based vector space for word-based speech recognition results and a syllable-based vector space for syllabl...


NTU System at MediaEval 2015: Zero Resource Query by Example Spoken Term Detection using Deep and Recurrent Neural Networks

This note documents the methods the authors implemented for the Query by Example Search on Speech Task (QUESST) as part of MediaEval 2015. In this work, we combined DTW, DNN and RNN in one framework to perform query-by-example spoken term detection in a zero-resource setting.



Publication date: 2016